ABSTRACT

Faced with a nonstandard, complicated practical
problem in statistical inference, the applied statistician sometimes
must use asymptotic approximations in order to compute standard
errors and confidence intervals and to test hypotheses. This usually
requires that he derive formulas for one or more asymptotic sampling
variances (and covariances) for one or more statistics. He must then
compute the numerical value of an estimate of some function of these
variances and covariances. If a statistic is a nonlinear function of
more than two or three sample statistics, the mathematical derivation
of the necessary variance (and covariance) formulas may be
burdensome, or even prohibitive. The purpose of the present paper is
to call attention to the computer program LASAHT, which computes estimated
asymptotic sampling variances and covariances numerically and carries
out hypothesis tests without need for the statistician to derive
formulas for them. (Author/RC)
Automated Hypothesis Tests and Standard Errors for Nonstandard Problems

Frederic M. Lord 
Educational Testing Service 

Introduction 

Faced with a nonstandard, complicated practical problem in statistical
inference, the applied statistician sometimes must use asymptotic
approximations in order to compute standard errors and confidence intervals and to
test hypotheses. This usually requires that he derive formulas for one or
more asymptotic sampling variances (and covariances) for one or more
statistics $\hat{\xi}_1, \hat{\xi}_2, \ldots$. He must then compute the numerical value of an
estimate of some function of these variances and covariances.

If $\hat{\xi}$ is a nonlinear function of more than two or three sample
statistics, the mathematical derivation of the necessary variance (and
covariance) formulas may be burdensome, or even prohibitive. The purpose
of the present paper is to call attention to the computer program LASAHT,
which computes estimated asymptotic sampling variances and covariances
numerically and carries out hypothesis tests without need for the
statistician to derive formulas for them. AUTEST, written by Martha Stocking,
and instructions for its use (Stocking and Lord, 1975) are available
from the authors.

Asymptotic Variances and Covariances

Let $\hat{\xi} \equiv \xi(t)$ be a differentiable function of sample statistics
denoted by the vector $t \equiv \{t_u\} \equiv \{t_u(X)\}$. Denote the expectation of $t$
by $\tau \equiv \{\tau_u\}$. If the $t_u$ have variances of order $N^{-1}$, where $N$ is the
sample size, then the asymptotic variance of $\hat{\xi}$, if finite, is given
(Kendall and Stuart, [year illegible], eq. 10.12) by

$$\operatorname{Var}\hat{\xi} = \sum_u \sum_v \frac{\partial \xi(\tau)}{\partial \tau_u}\,\frac{\partial \xi(\tau)}{\partial \tau_v}\,\operatorname{Cov}(t_u, t_v), \tag{1}$$

provided (1) is nonzero.

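A consistent estimator $\hat{\sigma}^2_{\xi}$ is usually obtained by substituting $t$
for $\tau$ in (1):

$$\hat{\sigma}^2_{\xi} = \sum_u \sum_v \frac{\partial \hat{\xi}}{\partial t_u}\,\frac{\partial \hat{\xi}}{\partial t_v}\,\widehat{\operatorname{Cov}}(t_u, t_v), \tag{2}$$

where

$$\widehat{\operatorname{Cov}}(t_u, t_v) \equiv \operatorname{Cov}(t_u, t_v)\big|_{\tau = t}.$$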



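If (1) is a rational function, consistency follows from a proposition due
to Slutsky (Cramér, [citation details illegible]). A similar result is deduced from
different assumptions by [citation illegible]. The covariance between two
functions $\hat{\xi}_a$ and $\hat{\xi}_b$ is likewise estimated from

$$\hat{\sigma}_{\xi_a \xi_b} = \sum_u \sum_v \frac{\partial \hat{\xi}_a}{\partial t_u}\,\frac{\partial \hat{\xi}_b}{\partial t_v}\,\widehat{\operatorname{Cov}}(t_u, t_v). \tag{3}$$

(We will consistently use the notation $\hat{\sigma}_{\xi_a \xi_b}$ rather than $\widehat{\operatorname{Cov}}(\hat{\xi}_a, \hat{\xi}_b)$
to denote the quantity defined by (3).)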
The computer obtains numerical values for $\partial \hat{\xi}_a(t)/\partial t_u$ directly by
numerical differentiation. It then obtains $\widehat{\operatorname{Cov}}(t_u, t_v)$ from standard
formulas, as will be explained later, and proceeds directly to compute
(2) and (3) for all $k(k+1)/2$ pairs in the set $\hat{\xi}_1, \hat{\xi}_2, \ldots, \hat{\xi}_k$.
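The computation just described can be sketched in a few lines. The following is a minimal illustration in Python, not the LASAHT source: the function names `numerical_gradient` and `delta_covariance`, the central-difference step size, and the numerical values are all assumptions made for the example.

```python
def numerical_gradient(func, t, rel_step=1e-5):
    """Central-difference estimate of d func / d t_u at the point t."""
    grad = []
    for u in range(len(t)):
        h = rel_step * max(abs(t[u]), 1.0)  # step scaled to the size of t_u
        up = list(t)
        up[u] += h
        down = list(t)
        down[u] -= h
        grad.append((func(up) - func(down)) / (2.0 * h))
    return grad


def delta_covariance(func_a, func_b, t, cov_t):
    """Double sum of gradient products weighted by the covariances of t."""
    ga = numerical_gradient(func_a, t)
    gb = numerical_gradient(func_b, t)
    n = len(t)
    return sum(ga[u] * gb[v] * cov_t[u][v] for u in range(n) for v in range(n))


# Hypothetical illustration: the ratio of two sample means.
t = [2.0, 4.0]                         # assumed values of t_1, t_2
cov_t = [[0.04, 0.01], [0.01, 0.09]]   # assumed estimated covariances of t


def ratio(stats):
    return stats[0] / stats[1]


var_ratio = delta_covariance(ratio, ratio, t, cov_t)
print(var_ratio)
```

When both function arguments are the same, `delta_covariance` returns a variance estimate of the form of the double sum for a single statistic; with two different functions it returns the corresponding covariance estimate.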

Asymptotic Hypothesis Testing 

In the preceding section, we started with some statistics $t_u$ convenient
for defining the $\hat{\xi}_a$. It will not matter what set of statistics
we choose so long as the $t_u$ are functionally independent of each other.
For (2) and (3), the vector $t \equiv \{t_u\}$ must include all statistics needed
for estimating all parameters in the matrix $\|\operatorname{Cov}(t_u, t_v)\|$. For example,
we might have $\hat{\xi}$ depending only on the sample means $m_1$ and $m_2$, with
$\|\operatorname{Cov}(m_i, m_j)\| = \|\sigma_{ij}/N\|$, where $m_i$
denotes a sample mean and $\sigma_{ij}$ a population covariance. In this case
$t \equiv \{m_1, m_2, s_{11}, s_{12}, s_{22}\}$.

In the present section, we start with a set of parameters denoted by
$\omega$ and consider an $N$-by-$n$ matrix $X$ of observations drawn from the
distribution $f(X|\omega)$. The parameters $\omega$ are assumed to be functionally
independent of each other. We wish to test the composite hypothesis
$H_0\colon \xi = 0$, where $\xi \equiv \{\xi_1, \xi_2, \ldots, \xi_k\}'$ is a vector of $k$ elements.

Let $\hat{\xi}$ be an estimate of $\xi$. If $k = 1$, $\xi$ is simply a scalar,
$\xi_1$, and $H_0$ can usually be tested by computing $\hat{\xi}/\hat{\sigma}_{\xi}$, where $\hat{\sigma}^2_{\xi}$ is the
asymptotic sampling variance of $\hat{\xi}$ with $t$ substituted for the unknown
parameters $\tau$. The rejection region for $H_0$ consists of one or both
tails of the asymptotic distribution of $\hat{\xi}/\hat{\sigma}_{\xi}$ under $H_0$. In most common
problems, this distribution is normal with zero mean and unit variance.

If $\xi$ is a vector of $k$ elements, $H_0$ can usually be tested by
computing $Q \equiv \hat{\xi}'\hat{C}^{-1}\hat{\xi}$, where $\hat{C}$ is an estimate of
$C \equiv \|\operatorname{Cov}(\hat{\xi}_a, \hat{\xi}_b)\|$ obtained from (2) and (3). The rejection
region for $H_0$ consists of all large values of $Q$. In most common problems,
the asymptotic distribution
of $Q$ is chi square with $k$ degrees of freedom.
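As a rough illustration of the vector test, the sketch below computes the quadratic form $Q$ and its percentile rank in the chi square distribution using only the Python standard library. It is not taken from LASAHT: the helper names (`solve`, `q_statistic`, `chi_square_cdf`), the series used for the chi square distribution function, and the numerical inputs are assumptions chosen for the example.

```python
import math


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(matrix[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def q_statistic(xi_hat, c_hat):
    """Q = xi_hat' C_hat^{-1} xi_hat, computed without forming the inverse."""
    weights = solve(c_hat, xi_hat)
    return sum(x * w for x, w in zip(xi_hat, weights))


def chi_square_cdf(q, k):
    """P(chi square with k d.f. <= q), from the power series for the
    regularized lower incomplete gamma function."""
    if q <= 0.0:
        return 0.0
    a, x = k / 2.0, q / 2.0
    term = 1.0 / a
    total = term
    n = 0
    while abs(term) > 1e-15 * abs(total):
        n += 1
        term *= x / (a + n)
        total += term
    return total * math.exp(-x + a * math.log(x) - math.lgamma(a))


# Hypothetical inputs: two estimated functions and their covariance matrix.
xi_hat = [1.0, 1.0]
c_hat = [[2.0, 1.0], [1.0, 2.0]]
Q = q_statistic(xi_hat, c_hat)
percentile = chi_square_cdf(Q, len(xi_hat))
print(Q, percentile)
```

Large values of `percentile` correspond to values of $Q$ in the rejection region. The power series converges for any positive argument but becomes slow for very large $Q$; a production program would more likely call a library routine.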

Stroud (1971a) proved that the asymptotic hypothesis testing procedure 
just described is valid under the conditions that 

1. The functions $\hat{\xi}_a \equiv \xi_a(t)$ ($a = 1, 2, \ldots, k$) have bounded and
continuous second derivatives in the neighborhood of $\xi = 0$.

2. The vector $t$ is asymptotically normal with mean $\tau$ and
nonsingular $\|\operatorname{Cov}(t_u, t_v)\|$.

3. $\|\widehat{\operatorname{Cov}}(t_u, t_v)\|$ is nonsingular with probability one and
converges in probability to $\|\operatorname{Cov}(t_u, t_v)\|$.

These conditions are fulfilled by a broad class of problems, some of which
are illustrated in Tables 1 and 2.

If the $\hat{\xi}_a$ are the maximum likelihood estimates of the $\xi_a$
($a = 1, \ldots, k$), obtained without the restriction $\xi = 0$, the
test described is asymptotically most stringent and is also locally
asymptotically most powerful (Moran, 1970; Wald, 1943). A regularity
condition worth noting is (as already implied by condition 1 above) that
$\xi = 0$ must not be on the boundary of the range of $\xi$.
Implementation

As presently written, LASAHT assumes that $f(X|\omega)$ is multivariate
normal. It would be fairly simple to substitute some other assumption,
as will be seen. If there are no restrictions on the parameters, the
elements of $\omega$ are presently taken to be the usual parameters: the means
($\mu_i$), variances ($\sigma_{ii}$ or $\sigma_i^2$), and covariances ($\sigma_{ij}$) of the random
variables $X_1, X_2, \ldots, X_n$. Naturally, the $t_u$ are taken to be the
sufficient statistics: sample means ($m_i$), sample variances ($s_{ii}$ or $s_i^2$),
and sample covariances ($s_{ij}$). For asymptotic work, it is immaterial
whether the $s_{ij}$ are the unbiased or the usual biased estimators of the
$\sigma_{ij}$. When there are no restrictions, the estimate of $\omega$, denoted by $\hat{\omega}$, is
identical with $t$; $\omega$ and $\tau$ are identical asymptotically.

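LASAHT uses standard formulas for the $\widehat{\operatorname{Cov}}(t_u, t_v)$ required in (2)
and (3). The standard formulas for the multivariate normal case, presently
incorporated into LASAHT, are

$$\begin{aligned}
\widehat{\operatorname{Cov}}(m_i, m_j) &= \hat{\sigma}_{ij}/N, \\
\widehat{\operatorname{Cov}}(s_{gh}, s_{ij}) &= (\hat{\sigma}_{gi}\hat{\sigma}_{hj} + \hat{\sigma}_{gj}\hat{\sigma}_{hi})/N, \\
\widehat{\operatorname{Cov}}(m_i, s_{gh}) &= 0,
\end{aligned} \tag{4}$$

where $\hat{\sigma}$ denotes a consistent estimator of a covariance.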
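To make the bookkeeping concrete, here is a small Python sketch that assembles the full covariance matrix of the sample means, variances, and covariances from these multivariate normal formulas. The function name `normal_cov_of_statistics`, the ordering of the statistics (means first, then $s_{gh}$ with $g \le h$), and the input values are assumptions of this illustration, not features of LASAHT.

```python
def normal_cov_of_statistics(sigma_hat, n_obs):
    """Estimated covariance matrix of (m_1..m_n, s_gh for g <= h)
    under multivariate normality.

    sigma_hat: n-by-n list of lists of consistent covariance estimates.
    n_obs:     sample size N.
    """
    n = len(sigma_hat)
    pairs = [(g, h) for g in range(n) for h in range(g, n)]
    labels = [("m", i) for i in range(n)] + [("s", g, h) for g, h in pairs]
    size = len(labels)                      # equals n(n + 3)/2
    cov = [[0.0] * size for _ in range(size)]

    # Cov(m_i, m_j) = sigma_ij / N
    for i in range(n):
        for j in range(n):
            cov[i][j] = sigma_hat[i][j] / n_obs

    # Cov(s_gh, s_ij) = (sigma_gi sigma_hj + sigma_gj sigma_hi) / N
    for a, (g, h) in enumerate(pairs):
        for b, (i, j) in enumerate(pairs):
            cov[n + a][n + b] = (
                sigma_hat[g][i] * sigma_hat[h][j]
                + sigma_hat[g][j] * sigma_hat[h][i]
            ) / n_obs

    # Cov(m_i, s_gh) = 0: those entries keep their initial value of zero.
    return labels, cov


# Hypothetical bivariate example with N = 100.
labels, cov = normal_cov_of_statistics([[4.0, 1.0], [1.0, 9.0]], 100)
for label, row in zip(labels, cov):
    print(label, row)
```

For a different distributional assumption, only the body of this function would need to change; the rest of the computation uses the resulting matrix unchanged.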

When using LASAHT, the statistician specifies the $k$ functions
$\xi_a$ ($a = 1, 2, \ldots, k$) in which he is interested. He does this simply by
writing a FORTRAN arithmetic assignment statement for each function,
expressing the corresponding $\hat{\xi}_a$ as a function of $t$. He inserts these
statements in a place provided in the program. LASAHT proceeds automatically
from this point, computing $t$, $\hat{\xi} \equiv \{\hat{\xi}_a\} \equiv \{\xi_a(t)\}$, $\|\partial\hat{\xi}_a/\partial t_u\|$,
$\|\widehat{\operatorname{Cov}}(t_u, t_v)\|$, $\hat{C} \equiv \|\hat{\sigma}_{\xi_a \xi_b}\|$, $Q \equiv \hat{\xi}'\hat{C}^{-1}\hat{\xi}$, and finally the percentile
rank of $Q$ in the appropriate chi square distribution.

If $f(X|\omega)$ is not multivariate normal, it is only necessary to redefine
the $t_u$ as functions of the observations $X$ and to replace (4)
by correct formulas for $\|\widehat{\operatorname{Cov}}(t_u, t_v)\|$. With these changes, LASAHT can
proceed just as described.

Without user action, the program accommodates two samples, each composed
of any number of observations on a maximum of 10 random variables.
More samples (up to 20) with fewer random variables can be accommodated
if the user sets all population covariances between variables from
different samples equal to zero (see below). In addition, the maximum of 10
random variables per sample can be increased, if desired.

Restrictions on the Parameters

In the normal multivariate case, there is a total of $n(n+3)/2$
sample means, variances, and covariances. Unless instructed otherwise,
LASAHT automatically uses these $n(n+3)/2$ sample statistics as estimators
for the corresponding parameters in $\omega$.

If $r$ restrictions are imposed on the parameters (for example,
certain means or variances are known to be equal, or certain covariances
are known to be zero), then $K$, the number of parameters in $\omega$, is
correspondingly reduced. If $r$ restrictions are imposed in the normal
multivariate case, then $K = n(n+3)/2 - r$. When $r$ restrictions
are to be imposed, the statistician must arrange matters so that $r$
of the estimators are functions of the remaining $K$ independent estimators,
to be denoted by $\hat{\omega}_1, \hat{\omega}_2, \ldots, \hat{\omega}_K$.

He does this by inserting in the program FORTRAN arithmetic assignment
statements defining whatever estimators he wishes. For example, if it
is known that two population means $\mu_1$ and $\mu_2$ are equal, the
statistician would supply FORTRAN statements defining $\hat{\mu}_1$ and $\hat{\mu}_2$ in terms of $m_1$, $m_2$,
and other estimators. If, for example, the sample sizes are equal, he could
then insert in the program the FORTRAN equivalent of the definitions
$\hat{\mu}_1 \equiv (m_1 + m_2)/2$ and $\hat{\mu}_2 \equiv (m_1 + m_2)/2$.

In this way the $\hat{\xi}_a$ and the $\hat{\omega}_u$ are directly or indirectly defined
as functions of the $m_i$ and the $s_{ij}$. It will be convenient to refer
to the $m_i$ and the $s_{ij}$ collectively as the $T_p$, $p = 1, 2, \ldots, n(n+3)/2$.

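By (3),

$$\hat{\sigma}_{\xi_a \xi_b} = \sum_u \sum_v \frac{\partial \hat{\xi}_a}{\partial \hat{\omega}_u}\,\frac{\partial \hat{\xi}_b}{\partial \hat{\omega}_v}\,\widehat{\operatorname{Cov}}(\hat{\omega}_u, \hat{\omega}_v). \tag{5}$$

Replacing the $t_u$ in (3) by the $T_p$ and the $\hat{\xi}_a$ by the $\hat{\omega}_u$,

$$\widehat{\operatorname{Cov}}(\hat{\omega}_u, \hat{\omega}_v) = \sum_p \sum_q \frac{\partial \hat{\omega}_u}{\partial T_p}\,\frac{\partial \hat{\omega}_v}{\partial T_q}\,\widehat{\operatorname{Cov}}(T_p, T_q),$$

where $\widehat{\operatorname{Cov}}(T_p, T_q) \equiv \operatorname{Cov}(T_p, T_q)\big|_{\omega = \hat{\omega}}$,
and using (5), we have by the chain rule for differentiation

$$\hat{\sigma}_{\xi_a \xi_b} = \sum_u \sum_v \sum_p \sum_q \frac{\partial \hat{\xi}_a}{\partial \hat{\omega}_u}\,\frac{\partial \hat{\xi}_b}{\partial \hat{\omega}_v}\,\frac{\partial \hat{\omega}_u}{\partial T_p}\,\frac{\partial \hat{\omega}_v}{\partial T_q}\,\widehat{\operatorname{Cov}}(T_p, T_q) = \sum_p \sum_q \frac{\partial \hat{\xi}_a}{\partial T_p}\,\frac{\partial \hat{\xi}_b}{\partial T_q}\,\widehat{\operatorname{Cov}}(T_p, T_q). \tag{6}$$

The $\widehat{\operatorname{Cov}}(T_p, T_q)$ are given by (4). LASAHT conveniently obtains the $\hat{\sigma}_{\xi_a \xi_b}$
from (6) rather than from (5).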
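The equivalence expressed by (6) can be checked numerically. The Python sketch below compares the direct route (differentiating the composite function with respect to the $T_p$) with the two-stage route through the covariance matrix of the $\hat{\omega}_u$. Everything specific in it is hypothetical: the restricted estimator `omega_hat` (a pooled mean), the function `xi`, the values in `T` and `cov_T`, and the helper names.

```python
import math


def gradient(func, point, rel_step=1e-5):
    """Central-difference gradient of a scalar function at `point`."""
    grad = []
    for p in range(len(point)):
        h = rel_step * max(abs(point[p]), 1.0)
        up = list(point)
        up[p] += h
        down = list(point)
        down[p] -= h
        grad.append((func(up) - func(down)) / (2.0 * h))
    return grad


def quadratic_form(ga, gb, cov):
    """Sum over i, j of ga[i] * gb[j] * cov[i][j]."""
    return sum(ga[i] * gb[j] * cov[i][j]
               for i in range(len(ga)) for j in range(len(gb)))


def omega_hat(T):
    """Hypothetical restricted estimators for mu_1 = mu_2 (K = 5 - 1 = 4)."""
    m1, m2, s11, s12, s22 = T
    return [(m1 + m2) / 2.0, s11, s12, s22]


def xi(omega):
    """Hypothetical function of interest: common mean over sigma_1."""
    mu, s11, s12, s22 = omega
    return mu / math.sqrt(s11)


# Assumed values of T = (m1, m2, s11, s12, s22) and of Cov-hat(T_p, T_q).
T = [10.0, 10.4, 4.0, 1.0, 9.0]
cov_T = [
    [0.08, 0.02, 0.00, 0.00, 0.00],
    [0.02, 0.18, 0.00, 0.00, 0.00],
    [0.00, 0.00, 0.64, 0.16, 0.04],
    [0.00, 0.00, 0.16, 0.74, 0.36],
    [0.00, 0.00, 0.04, 0.36, 3.24],
]

# Direct route: differentiate xi(omega_hat(T)) with respect to T.
g_T = gradient(lambda point: xi(omega_hat(point)), T)
var_direct = quadratic_form(g_T, g_T, cov_T)

# Two-stage route: covariance matrix of omega_hat first, then xi.
K = len(omega_hat(T))
jac = [gradient(lambda point, c=c: omega_hat(point)[c], T) for c in range(K)]
cov_omega = [[quadratic_form(jac[c], jac[d], cov_T) for d in range(K)]
             for c in range(K)]
g_omega = gradient(xi, omega_hat(T))
var_two_stage = quadratic_form(g_omega, g_omega, cov_omega)

print(var_direct, var_two_stage)
```

The two results agree to within the accuracy of the numerical derivatives, while the direct route needs only one set of derivatives and no intermediate covariance matrix.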



Example 1

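Suppose the statistician wishes to test the hypothesis $\xi \equiv$ [expression illegible] $= 0$
under the restriction $\mu_1 = \mu_2$. For the normal bivariate case,
$T \equiv \{m_1, m_2, s_1^2, s_{12}, s_2^2\}$.
It does not matter whether $\omega$ is defined as $\{\mu_1, \sigma_1^2, \sigma_{12}, \sigma_2^2\}$ or as
$\{\mu_2, \sigma_1^2, \sigma_{12}, \sigma_2^2\}$. The statistician supplies the FORTRAN definitions

$$\hat{\sigma}_i^2 \equiv s_i^2 + (m_i - \hat{\mu}_i)^2$$

and [remaining definitions illegible].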
LASAHT proceeds automatically from this point on. In accordance
with (6), $\operatorname{Var}\hat{\xi}$ is computed from the $\widehat{\operatorname{Cov}}(T_p, T_q)$ given by (4).

The $\hat{\mu}$ and $\hat{\sigma}$ given above are well known as the maximum likelihood
estimators under the restriction $\mu_1 = \mu_2$. LASAHT does not automatically
obtain MLE. If the statistician does not have formulas for the MLE, he
can use any consistent estimators in their place. In Example 1, he could
without much loss have chosen $\hat{\sigma}_1^2 \equiv s_1^2$, $\hat{\sigma}_2^2 \equiv s_2^2$, and $\hat{\sigma}_{12} \equiv s_{12}$. If
inefficient estimators are used, the test of the hypothesis is still valid,
but the power of the test is reduced.

Illustrative Problems 



LASAHT has been checked out by applying it to numerical examples,
testing some three dozen different null hypotheses for which the numerical
answers could be verified. The partial listing in Tables 1 and 2 may
suggest the scope of the program. Primes are used to distinguish parameters
of two different populations; $\rho_{ij} \equiv \sigma_{ij}/\sigma_i\sigma_j$, $\mu \equiv \{\mu_1, \mu_2, \ldots\}'$, and [remaining definitions illegible].
Example 2

Consider testing the null hypothesis that the tetrad $\rho_{12}\rho_{34} - \rho_{13}\rho_{24} = 0$
with restrictions on the parameters that include $\sigma_{23} = \sigma_{24}$ [remaining restrictions illegible]
(Lord, in press). First, let us replace the null hypothesis by the equivalent
hypothesis that $\xi \equiv \sigma_{12}\sigma_{34} - \sigma_{13}\sigma_{24} = 0$. The definition of the function
$\xi$ is provided to the computer by inserting in the program the arithmetic
assignment statement

XIHAT(1) = [FORTRAN equivalent of $\hat{\sigma}_{12}\hat{\sigma}_{34} - \hat{\sigma}_{13}\hat{\sigma}_{24}$]

It is further necessary to provide FORTRAN statements defining the
$\hat{\sigma}$ in terms of sample means, variances, and covariances. We choose
[estimator definitions illegible],
which are the maximum likelihood estimators under the stated restrictions.
The remaining estimators are not defined explicitly in the
program, with the result that the computer resorts to a default procedure
that assumes (correctly) that each of them is the corresponding sample statistic,
for example $\hat{\sigma}_{12} \equiv s_{12}$.

Provided with the FORTRAN definitions shown, the computer will now
compute $\hat{\xi}$, its estimated asymptotic variance $\hat{\sigma}^2_{\xi}$, the test statistic
$\hat{\xi}/\hat{\sigma}_{\xi}$, and the percentile corresponding to the test statistic. All this
is easier for the statistician than deriving the formula for $\hat{\sigma}^2_{\xi}$ (an
eighth degree polynomial containing ten terms involving seven statistics)
and then computing the test statistic from this formula.
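For readers who want to see the whole sequence in one place, the following Python sketch carries a tetrad difference through to its test statistic. It is a simplified illustration under stated assumptions, not a reproduction of this example: it ignores the restrictions on the parameters and uses the unrestricted sample covariances throughout, it assumes multivariate normality for the covariances of the $s_{gh}$, and the function names and the covariance matrix `s` are hypothetical.

```python
import math


def tetrad(s):
    """xi-hat = s12*s34 - s13*s24 for a 4-by-4 covariance matrix (0-based)."""
    return s[0][1] * s[2][3] - s[0][2] * s[1][3]


def tetrad_test(s, n_obs, step=1e-6):
    """Return xi-hat, its estimated variance, the ratio z, and the
    standard normal percentile of z."""
    pairs = [(g, h) for g in range(4) for h in range(g, 4)]

    # Numerical derivative of xi-hat with respect to each distinct s_gh.
    grad = []
    for g, h in pairs:
        def shifted(delta, g=g, h=h):
            m = [row[:] for row in s]
            m[g][h] += delta
            m[h][g] = m[g][h]      # keep the matrix symmetric
            return tetrad(m)
        grad.append((shifted(step) - shifted(-step)) / (2.0 * step))

    # Double sum over pairs, with normal-theory covariances of the s_gh.
    var = 0.0
    for a, (g, h) in enumerate(pairs):
        for b, (i, j) in enumerate(pairs):
            cov = (s[g][i] * s[h][j] + s[g][j] * s[h][i]) / n_obs
            var += grad[a] * grad[b] * cov

    xi_hat = tetrad(s)
    z = xi_hat / math.sqrt(var)
    percentile = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return xi_hat, var, z, percentile


# Hypothetical sample covariance matrix with a vanishing tetrad difference.
s = [
    [2.0, 2.0, 1.0, 2.0],
    [2.0, 5.0, 2.0, 4.0],
    [1.0, 2.0, 2.0, 2.0],
    [2.0, 4.0, 2.0, 5.0],
]
xi_hat, var, z, percentile = tetrad_test(s, n_obs=200)
print(xi_hat, var, z, percentile)
```

Only the one-line function `tetrad` is specific to the hypothesis; testing a different function of the covariances would mean replacing that line and nothing else, which is the convenience the paper describes.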

Example 3 

In a Monte Carlo study, 1000 values of [statistic illegible] and their probability
levels were computed by LASAHT, where [defining expression in the $\hat{\rho}_{ij}$ illegible]
(Lord, in press). The time required on a 360/65 for all 1000 was about
80 seconds.

Example 4

The last example in Table 2 was carried through for sets of data
(Stroud, 1971b) having 8 observed variables in each of two groups. Thus
there were 88 different sample statistics $T_p$ involved in the double
summation in (6). The vector null hypothesis $\xi = 0$ that was tested
consisted of $k = 18$ separate equations of the form $\xi_a = 0$. In three
separate applications, LASAHT produced values for the test statistic
identical to those obtained by Stroud using complicated analytic formulas.

The hypothesis tested may be described as a multivariate analysis
of covariance hypothesis with three criterion variables and five
covariables, modified to take account of random errors of measurement in the
covariables. Problems of this complexity are very difficult to
carry through without the aid of a program such as LASAHT.